Dynamic Binary Parallelization
Abstract
A large and important base of existing software is being left behind by emerging microprocessor architectures. Recently, fundamental issues in microprocessor technologies have led designers to increase the number of cores on a chip instead of increasing its single-threaded performance. Multi-core designs with 4 to 8 cores are ubiquitous, and trends suggest that core counts will continue to grow for the foreseeable future [29, 36]. Unfortunately, most existing software is designed for single-core processors, and is therefore unable to fully exploit the increased processing power offered by many-core processors. This existing software base represents years and sometimes decades of investment. One solution to the problem is program parallelization; however, state-of-the-art parallelization technologies are not always practical for existing software. Many existing techniques require source code to be rewritten using parallel languages [15, 33] or libraries [35, 88], but this is often impractical due to cost: efforts to analyze, fix, and test existing software due to the Y2K bug alone were estimated to have cost about $20 billion in the 1990s [78], and rewriting code to find opportunities for parallelism would be a much larger task. Alternatively, automatic parallelization techniques do not require code to be rewritten, but they typically do require access to the source code for analysis. In many cases, all or some of the source code and development toolchain may be lost or, in the case of third-party software, never available. Furthermore, software systems often involve components written in different programming languages, which makes cross-module parallelization difficult, if not impossible. Some parallelization techniques do not require source code and analyze the binary executable directly [19, 89, 96], but even these techniques are applied statically and so cannot parallelize across dynamically linked executables and libraries, which are not known until run time and can change or be upgraded. To address these problems, the proposed research will answer the question:
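To make the rewriting cost concrete, the sketch below shows what library- or directive-based parallelization typically demands of source code. OpenMP is used purely as a representative example (the specific languages and libraries of [15, 33, 35, 88] are not identified here), and the function is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Representative example only: parallelizing even a trivial loop with a
// directive-based approach such as OpenMP requires editing and rebuilding the
// source, which is exactly the step that is impractical when only a legacy
// binary is available.
void scale(std::vector<double>& v, double factor) {
    // Without OpenMP enabled (e.g., -fopenmp) the pragma is ignored and the
    // loop simply runs sequentially.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(v.size()); ++i) {
        v[i] *= factor;   // iterations are independent, so they can be split across cores
    }
}
```

Dynamic binary parallelization aims to recover the same kind of loop- or trace-level parallelism without this source-level step.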
Similar papers
Potential of Dynamic Binary Parallelization
As core counts continue to grow in modern microarchitectures, automatic parallelization technologies are becoming increasingly important to fill the gap between hardware that has increased parallelism and software that is still designed for sequential execution. In previous research, we have proposed a novel dynamic binary parallelization scheme called T-DBP, which leverages hot traces to provi...
Trace-Based Dynamic Binary Parallelization
With the number of cores increasing rapidly but the performance per core increasing slowly at best, software must be parallelized in order to improve performance. Manual parallelization is often prohibitively time-consuming and error-prone (especially due to data races and memory-consistency complexities), and some portions of code may simply be too difficult to understand or refactor for paral...
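As a concrete illustration of the data-race hazard mentioned above, the following small program (hypothetical, not from the cited work) shows how an unsynchronized shared counter silently loses updates, while the std::atomic variant is a minimal correct fix.

```cpp
#include <atomic>
#include <iostream>
#include <thread>

// Small illustration of why manual parallelization is error-prone: the
// unsynchronized counter below has a data race (undefined behavior in C++),
// while the std::atomic version is the minimal correct fix.
int main() {
    long racy = 0;                 // plain counter: concurrent ++ is a data race
    std::atomic<long> safe{0};     // atomic counter: concurrent ++ is well-defined

    auto work = [&] {
        for (int i = 0; i < 100000; ++i) {
            ++racy;                // lost updates are likely here
            ++safe;
        }
    };
    std::thread a(work), b(work);
    a.join();
    b.join();

    // "safe" always prints 200000; "racy" frequently prints less.
    std::cout << "racy: " << racy << "  safe: " << safe.load() << '\n';
}
```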
Feasibility of Dynamic Binary Parallelization
This paper proposes DBP, an automatic technique that transparently parallelizes a sequential binary executable while it is running. A prototype implementation in simulation was able to increase sequential execution speeds by up to 1.96x, averaged over three benchmark suites.
Dynamic Parallelization and Vectorization of Binary Executables on Hierarchical Platforms
With established expectations of continued sequential performance increases no longer being met, performance improvements are increasingly being sought via coarse-grained parallelism. Current trends in computing point toward platforms seeking performance improvements through various degrees of parallelism, with coarse-grained parallelism features becoming commonplace in even entry-level systems. Ye...
Dynamic and Speculative Polyhedral Parallelization of Loop Nests Using Binary Code Patterns
Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile time due to the use of dynamic data and control structures. Another motivation for being speculative is to adapt the code to the current execution context, by selecting an efficient parallel schedule at run time. However, since this parallelization scheme requires on-the-fly ...
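The following is a deliberately simplified, hypothetical sketch of run-time loop speculation, not the polyhedral scheme described above: each worker buffers its writes privately, a validation step conservatively rejects the run if any location was written more than once, and the caller is expected to fall back to sequential execution on failure. A real system would also track reads and cross-iteration dependences; all names here are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch of speculative loop parallelization (assumed names, not
// the cited system): each worker executes a chunk of iterations into a private
// write log, validation conservatively rejects the run if any index was written
// more than once, and the buffered writes are committed only on success.
using WriteLog = std::vector<std::pair<std::size_t, int>>;

bool run_speculatively(std::vector<int>& data,
                       const std::function<void(std::size_t, WriteLog&)>& body,
                       std::size_t n, unsigned workers) {
    std::vector<WriteLog> logs(workers);
    std::vector<std::thread> pool;
    const std::size_t chunk = (n + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w)
        pool.emplace_back([&, w] {
            const std::size_t lo = w * chunk, hi = std::min(n, lo + chunk);
            for (std::size_t i = lo; i < hi; ++i)
                body(i, logs[w]);            // writes are buffered, not applied yet
        });
    for (auto& t : pool) t.join();

    // Validation: any index written more than once is treated as a conflict.
    std::vector<std::size_t> touched;
    for (const auto& log : logs)
        for (const auto& entry : log)
            touched.push_back(entry.first);
    std::sort(touched.begin(), touched.end());
    if (std::adjacent_find(touched.begin(), touched.end()) != touched.end())
        return false;                        // caller re-runs the loop sequentially

    for (const auto& log : logs)             // commit the speculative state
        for (const auto& entry : log)
            data[entry.first] = entry.second;
    return true;
}
```

A caller would pass a loop body that reads data and appends its (index, value) writes to the log, retrying the loop sequentially whenever the function returns false.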
Parallel Sorting and Star-P Data Movement and Tree Flattening
This thesis studies three problems in the field of parallel computing. The first result provides a deterministic parallel sorting algorithm that empirically shows an improvement over two sample sort algorithms. When using a comparison sort, this algorithm is 1-optimal in both computation and communication. The second study develops some extensions to the Star-P system [7, 6] that allow it to s...